Benchmarking Question Answering Systems
Authors
Abstract
The need to make the Semantic Web more accessible to lay users and the uptake of interactive systems and smart assistants for the Web have spawned a new generation of RDF-based question answering systems. However, comparing the quality of these systems, repeating published experiments, or running them on the same datasets remains a complex and time-consuming task. Thus, we extended the GERBIL benchmarking framework to support the fine-grained evaluation of question answering systems. In this paper, we describe the evaluation paradigm underlying our extension. In addition, we present the current implementation of the solution, including different measures, datasets, and preimplemented systems, as well as possibilities to work with novel formats for interactive and non-interactive benchmarking of question answering systems. One particular feature of our framework is its provision of diagnostics, which give developers insight into the weaknesses and strengths of their systems. We thereby provide an open benchmarking suite that can potentially speed up the development of future systems.
Similar resources
A Survey of Datasets for Biomedical Question Answering Systems
The massive and ever-increasing amount of textual and linked biomedical data available online poses many challenges for information seekers. So, the focus of the information retrieval community has shifted to precise information retrieval, i.e. providing an exact answer to a user question. In recent years, many datasets related to Biomedical Question Answering (BioQA) have emerged which the researchers ...
Optimizing question answering systems by Accelerated Particle Swarm Optimization (APSO)
One of the most important research areas in natural language processing is Question Answering Systems (QASs). Existing search engines, with Google at the top, have many remarkable capabilities. But there is a basic limitation (search engines do not have deduction capability), a capability which a QAS is expected to have. In this perspective, a search engine may be viewed as a semi-mechanized QA...
A New Statistical Model for Evaluation Interactive Question Answering Systems Using Regression
The development of computer systems and extensive use of information technology in the everyday life of people have just made it more and more important for them to make quick access to information that has received great importance. Increasing the volume of information makes it difficult to manage or control. Thus, some instruments need to be provided to use this information. The QA system is ...
Evaluating question answering over linked data
The availability of large amounts of open, distributed and structured semantic data on the web has no precedent in the history of computer science. In recent years, there have been important advances in semantic search and question answering over RDF data. In particular, natural language interfaces to online semantic data have the advantage that they can exploit the expressive power of Semantic...
Overview of Todai Robot Project and Evaluation Framework of its NLP-based Problem Solving
We introduce the organization of the Todai Robot Project and discuss its achievements. The Todai Robot Project task focuses on benchmarking NLP systems for problem solving. This task encourages NLP-based systems to solve real high-school examinations. We describe the details of the method to manage question resources and their correct answers, answering tools and participation by researchers in...